Abstract

Minimum market transparency requirements oblige Hedge Fund (HF) managers to follow in
practice the strategy declared in the fund statement. However, each declared strategy may
give rise to a multiplicity of implemented management decisions. Is the "actual" strategy
then the same as the "announced" strategy? Can the actual strategy be monitored, or compared
with the actual strategy of HF belonging to the same "announced" class? Can the announced or
actual strategy be used as a quantitative argument in a fund of funds policy? With the
appropriate metric, it is possible to draw a minimum spanning tree (MST) that emphasizes the
similarity structure that could be hidden in the raw correlation matrix of HF returns.
1 Introduction



Once confined to offshore destinations, Hedge Funds (HF) are managed portfolios subject to
the fewest constraints. Enjoying more freedom than traditional institutional funds, HF
could historically achieve better returns under the same market conditions: for example, short
selling makes it possible to profit from a bear market. Investing in a HF is a trust
contract between the investor and the fund manager's strategy. However, complete freedom
in managing investors' money is hard to sell. To ensure a minimum of transparency
between the HF manager's strategy and investors, national regulatory institutions ask hedge
funds to declare the adopted policy in the fund statement. Practice has narrowed the "types"
of management style down to a few, by now well-known, classes. Why do we care about the
management style of a hedge fund? Because a growing market practice tends to treat HF as
ordinary assets, hence as objects of a further portfolio optimization by the investor.
Unawareness of HF peculiarities and biases may hide the risk of being fooled by randomness.
For a nice introduction to HF see e.g. [Lha02].

Data on HF performance present more shortcomings than asset time series: HF Net Asset
Values (NAV) are quoted just once a month. Given the highly proprietary character of the
applied strategies, which are jealously guarded, the only public data concern indices grouping
HF by style. These indices contain a changing set of HF, as funds disappear and new ones
appear, inducing an unavoidable "survivor bias".

In this first paper we focus on the characterization of the applied strategy. In particular,
our data set shows that strategies play for HF a role similar to that of market sectors
for traditional assets. We apply combinatorial optimization techniques to analyze the
characteristics that cluster funds together and those that differentiate them. Finally, we show
how this approach can be used to monitor and select HF investments, highlighting those
individual HF with a tendency to depart from their stated investment strategy.



2 The Data 



We use single HF NAV time series of T synchronous observations, where: 

• Time: Jan-1999 to Jan-2003, 49 monthly observations; T = 49

• Assets: 

• 62 Funds with strategies reported in Table (see [Lha02] for strategy definitions).

• 5 Market Indices: MIB30, DJ, HSI, NDQ, FTSE, SP500 



3 Noise and Signal in Correlation 

Rank reduction for correlation matrices is usually obtained via a standard reduced-rank
approximation in which the smaller eigenvalues are set to zero [Bri01,RJ99]. As in [VP01] we
identify the significant eigenvalues of the correlation matrix by selecting those that depart
from the spectrum of a symmetric random matrix of the same size. Random Matrix Theory (RMT)
offers a way to clean the matrix of its random components [Met90]. The "cleaned" correlation
matrix is then used to determine Euclidean distances between funds.

We first normalize the monthly returns of all series, then compute the $N \times N$ equal-time
cross-correlation matrix C. Problems in measurement (see e.g. [VP01]) are: (i) non-stationarity
of the matrix as market conditions change; (ii) the finite length of the time series available
to estimate cross-correlations, which introduces "measurement noise". Increasing T to mitigate
problem (ii) aggravates problem (i). Both sources (i) and (ii) inject random contributions into
the correlation matrix.
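
As a rough illustration of this step, the sketch below turns NAV series into returns, standardizes each series and computes the equal-time cross-correlation matrix. The choice of simple (arithmetic) monthly returns, the function names and the toy NAV values are assumptions made for the example; they are not taken from the data set described above.

```python
import math

def returns_from_nav(nav):
    """Simple monthly returns from a NAV series (assumption: arithmetic returns)."""
    return [nav[t + 1] / nav[t] - 1.0 for t in range(len(nav) - 1)]

def normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation.
    Assumes the series is not constant, otherwise the deviation is zero."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return [(x - mean) / std for x in series]

def correlation_matrix(all_returns):
    """Equal-time cross-correlation matrix of N return series of equal length."""
    z = [normalize(r) for r in all_returns]
    n_obs = len(z[0])
    size = len(z)
    return [[sum(a * b for a, b in zip(z[i], z[j])) / n_obs for j in range(size)]
            for i in range(size)]

# Toy NAV series (made-up numbers, for illustration only).
navs = [
    [100.0, 102.0, 101.0, 105.0, 107.0],
    [50.0, 51.5, 50.8, 52.9, 54.0],
    [80.0, 79.0, 80.5, 78.0, 77.5],
]
C = correlation_matrix([returns_from_nav(s) for s in navs])
```

With only a handful of observations, as in this toy case, the entries of C are dominated by the "measurement noise" of problem (ii).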

We test the eigenvalues of the matrix C against the null hypothesis given by the eigenvalues
of a symmetric random matrix of the same size. In the limit $N, T \to \infty$ with a fixed ratio
$Q = T/N > 1$, the spectral density is given by [SM99]:



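$$\rho(\lambda) = \frac{Q}{2\pi}\,\frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda}, \qquad \lambda_\pm = 1 + \frac{1}{Q} \pm 2\sqrt{\frac{1}{Q}} \qquad (1)$$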
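
The edges of the random-matrix spectrum are easy to evaluate numerically. The sketch below assumes the standard unit-variance form of the density, $\rho(\lambda) = \frac{Q}{2\pi}\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}/\lambda$ with $\lambda_\pm = 1 + 1/Q \pm 2\sqrt{1/Q}$, which is appropriate for normalized series. The function names, the value Q = 4 and the list of eigenvalues are hypothetical.

```python
import math

def mp_edges(q):
    """Return (lambda_minus, lambda_plus) for Q = T/N, assuming unit variance."""
    if q < 1.0:
        raise ValueError("this form of the density assumes Q = T/N >= 1")
    half_width = 2.0 * math.sqrt(1.0 / q)
    centre = 1.0 + 1.0 / q
    return centre - half_width, centre + half_width

def mp_density(lam, q):
    """Spectral density rho(lambda); zero outside (lambda_minus, lambda_plus)."""
    lam_minus, lam_plus = mp_edges(q)
    if lam <= lam_minus or lam >= lam_plus:
        return 0.0
    return q / (2.0 * math.pi) * math.sqrt((lam_plus - lam) * (lam - lam_minus)) / lam

def significant(eigenvalues, q):
    """Eigenvalues lying above the upper edge lambda_plus of the random spectrum."""
    _, lam_plus = mp_edges(q)
    return [ev for ev in eigenvalues if ev > lam_plus]

# Hypothetical example: Q = 4 (for instance T = 400 observations of N = 100 series).
lam_minus, lam_plus = mp_edges(4.0)
kept = significant([0.4, 1.1, 2.0, 3.7, 12.5], 4.0)
```

Eigenvalues returned by `significant` are the candidates for carrying genuine correlation structure; everything inside the band is compatible with pure noise in the asymptotic limit.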

The three highest eigenvalues of the correlation matrix spectrum for the 62 HF and 5 market
indices are compared to the highest eigenvalue $\lambda_+$ given by eq. (1). To gain insight
into the measurement accuracy of the three highest observed eigenvalues $\lambda_1, \lambda_2,
\lambda_3$, we determined the bootstrapped distribution of their estimators. Since eq. (1) is
valid for $N, T \to \infty$ with $Q = T/N > 1$ fixed, we also test the finite-size effect on
the determination of $\lambda_{max} = \lambda_+$. Results are reported in Fig. 1(a), showing
that the three highest eigenvalues of the observed correlation matrix are not compatible with
the Random Matrix Hypothesis. In the following we use the rank-reduced matrix C obtained by
zeroing all the eigenvalues lower than $\lambda_3$. Since zeroing eigenvalues alters the
diagonal, we may consider C as a covariance matrix; the associated correlation matrix can be
recovered as follows:

$$C_{ij} \;\to\; \frac{C_{ij}}{\sqrt{C_{ii}\,C_{jj}}}$$
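
The filtering just described can be sketched as follows: eigenvalues below the retained ones are set to zero, and the result is rescaled so that the diagonal is again equal to one. The use of NumPy, the function name `rank_reduce` and the synthetic three-factor data are assumptions of this example, not part of the original analysis.

```python
import numpy as np

def rank_reduce(corr, keep=3):
    """Zero all but the `keep` largest eigenvalues, then rescale to unit diagonal."""
    corr = np.asarray(corr, dtype=float)
    if not 1 <= keep <= corr.shape[0]:
        raise ValueError("keep must be between 1 and the matrix size")
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    filtered = eigvals.copy()
    filtered[:-keep] = 0.0                       # zero everything below the top `keep`
    reduced = (eigvecs * filtered) @ eigvecs.T   # rank-reduced matrix, diagonal != 1
    scale = np.sqrt(np.diag(reduced))
    return reduced / np.outer(scale, scale)      # C_ij / sqrt(C_ii * C_jj)

# Synthetic data, for illustration only: 8 series of 60 observations
# driven by 3 common factors plus idiosyncratic noise.
rng = np.random.default_rng(0)
factors = rng.standard_normal((3, 60))
loadings = rng.standard_normal((8, 3))
series = loadings @ factors + 0.5 * rng.standard_normal((8, 60))
C = np.corrcoef(series)
C_clean = rank_reduce(C, keep=3)
```

The rescaling does not change the rank, so `C_clean` is a valid correlation matrix of rank at most `keep`.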
4 Minimum Spanning Tree: How to Visualize Dependencies



We consider HF time series as points in Euclidean space. The method is based on a notion of
similarity expressed by distance: similarity grows as distance shrinks. A frequently used
measure is the Euclidean distance between two normalized time series, which can be computed
from the linear correlation coefficient [Man99,RRV86]:

$$d_{ij}^2 = \sum_{k=1}^{n} (S_{ik} - S_{jk})^2 = 2\,(1 - C_{ij}) \qquad (2)$$
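
The identity between squared Euclidean distance and correlation holds when each series is centred and rescaled to unit Euclidean norm, so that the sum of its squared entries equals one. The short check below makes this explicit; the normalization convention, the function names and the two toy series are assumptions of the example.

```python
import math

def unit_normalize(series):
    """Centre a series and rescale it to unit Euclidean norm."""
    mean = sum(series) / len(series)
    centred = [x - mean for x in series]
    norm = math.sqrt(sum(c * c for c in centred))
    return [c / norm for c in centred]

def squared_distance(s_i, s_j):
    """Sum over k of (S_ik - S_jk)^2."""
    return sum((a - b) ** 2 for a, b in zip(s_i, s_j))

def correlation(s_i, s_j):
    """Linear correlation of two unit-normalized series (their dot product)."""
    return sum(a * b for a, b in zip(s_i, s_j))

def distance_matrix(corr):
    """Distances d_ij = sqrt(2 (1 - C_ij)) from a correlation matrix."""
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - c))) for c in row] for row in corr]

# Two toy return series (made-up numbers).
s_a = unit_normalize([0.02, -0.01, 0.04, 0.019])
s_b = unit_normalize([-0.012, 0.019, -0.031, -0.006])
lhs = squared_distance(s_a, s_b)
rhs = 2.0 * (1.0 - correlation(s_a, s_b))
```

Here `lhs` and `rhs` agree up to rounding, and the distance ranges from 0 for perfectly correlated series to 2 for perfectly anti-correlated ones.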




Given the distances between all funds, and considering each fund as a node, it is possible to
build a data representation known as the Minimum Spanning Tree (MST): the tree that connects
all nodes with minimum total edge length (it is unique when all distances are distinct). The
MST can be obtained through simple algorithms [PA01]. Distance minimization between tree nodes
allows for a natural classification of HF into clusters containing elements with similar
realizations.
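
One such simple algorithm is Prim's, sketched below for a dense symmetric distance matrix. The text does not say which algorithm was actually used, so this choice is an assumption, as are the function name and the four-node toy matrix.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns a list of edges (i, j, d), where i was already in the tree
    when node j was attached to it at distance d.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n    # cheapest known connection of each node to the tree
    parent = [-1] * n            # tree node realizing that connection
    best[0] = 0.0                # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Update the cheapest connections through the newly added node.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges

# Toy distance matrix for four nodes (made-up numbers).
toy_dist = [
    [0.0, 1.0, 4.0, 3.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 1.5],
    [3.0, 5.0, 1.5, 0.0],
]
mst_edges = minimum_spanning_tree(toy_dist)
total_length = sum(d for _, _, d in mst_edges)
```

A tree on N nodes always has N - 1 edges, so a matrix of N(N - 1)/2 distances is summarized by its N - 1 most relevant links.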

In our example, using the return time series of 62 hedge funds in the period January 2000 to
January 2003, we determined the associated MST (Fig. 1(b)). We note how the clustering
process leads to an economically meaningful classification of strategies and how anomalies can
be promptly detected. For some clusters (cona, gta/glm, emn, emm) there exists a fund that
acts as a reference center (cona 43, cta51, emm30, ), while strategies such as "Long-Short" do
not show peculiar characteristics, consistent with the heterogeneity and discretion of this
management style. This observation suggests that the less discretionary the management
policies are (for example, those following mathematical models implemented in software), the
more similar the corresponding returns are.
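
A reading of the tree along these lines can be automated. The sketch below is only one possible heuristic, not the procedure used for Fig. 1(b): it flags a fund as anomalous when none of its tree neighbours shares its declared strategy, and it takes the highest-degree node of each strategy as a candidate reference center. The edge list, the labels and the function names are hypothetical.

```python
from collections import defaultdict

def neighbours(edges):
    """Adjacency sets of a tree given as a list of (i, j, distance) edges."""
    adj = defaultdict(set)
    for i, j, _ in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def flag_anomalies(edges, strategy):
    """Nodes whose tree neighbours all declare a different strategy."""
    adj = neighbours(edges)
    return sorted(node for node, nb in adj.items()
                  if all(strategy[m] != strategy[node] for m in nb))

def strategy_hubs(edges, strategy):
    """Highest-degree node per declared strategy (first one found wins ties)."""
    adj = neighbours(edges)
    hubs = {}
    for node, nb in adj.items():
        label = strategy[node]
        if label not in hubs or len(nb) > len(adj[hubs[label]]):
            hubs[label] = node
    return hubs

# Hypothetical tree on seven funds with made-up distances and labels.
toy_edges = [(0, 1, 0.6), (0, 2, 0.7), (0, 3, 1.1),
             (3, 4, 0.8), (4, 5, 0.5), (4, 6, 1.2)]
toy_strategy = {0: "cta", 1: "cta", 2: "cta", 3: "emn",
                4: "emn", 5: "emn", 6: "cta"}
anomalous = flag_anomalies(toy_edges, toy_strategy)
hubs = strategy_hubs(toy_edges, toy_strategy)
```

In this toy tree fund 6 declares the "cta" style but hangs from an "emn" fund, which is the kind of node that would call for a closer look.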






5 Conclusions 



We observe a broad coherence between the usual qualitative classification of HF and the
phenomenological classification deduced from historical data on HF returns. Despite managers'
discretion, macro strategies seem to share enough common features in their realized returns to
be classified quantitatively. In a universe where transparency towards investors is low and
operative strategies are protected, data mining and classification tools make it possible to
extract maximum information from the available data and to verify whether the strategies
declared in the fund statement are actually followed. In particular, nodes that are anomalous
with respect to their reference cluster identify funds that require deeper investigation.
Beyond qualitative control, these results are useful to maximize portfolio diversification, by
selecting funds belonging to different clusters and by paying special attention to the funds
sitting at central nodes. Moreover, the characteristics that identify a cluster help to define
the actual benchmark for a given strategy. Finally, the tree structure offers an objective
basis for drawing economic conclusions and for portfolio selection and control. By extracting
the structure hidden in large correlation matrices, trees provide a representation that is
easier to interpret than the matrices themselves.

Fig. 1. (a) Distribution of the highest eigenvalues compared to $\lambda_+$ given by RMT results
for uncorrelated returns; (b) Minimum Spanning Tree for HF Strategies.






